Abstract: Agonist binding is related to a series of motions in G protein-coupled receptors
(GPCRs) that result in the separation of transmembrane helices III and VI at their cytosolic 
ends and subsequent G protein binding. A large number of smaller motions also seem to be 
associated with activation. Most helices in GPCRs are highly irregular and often contain 
kinks, with extensive literature already available about the role of prolines in kink 
formation and the precise function of these kinks. GPCR transmembrane helices also 
contain many α-bulges. In this article we aim to draw attention to the role of these α-bulges
in ligand and G protein binding, as well as their role in several aspects of the mobility
associated with GPCR activation. This mobility includes regularization and translation of 
helix III in the extracellular direction, a rotation of the entire helix VI, an inward 
movement of the helices near the extracellular side, and a concerted motion of the cytosolic 
ends of the helices that makes their orientation appear more circular and that opens up 
space for the G protein to bind. In several cases, α-bulges either appear or disappear as part
of the activation process. 

Keywords: GPCR; π-helix; α-bulge; GPCR activation; random forest; structure-function



Int. J. Mol. Sci. 2014, 15



1. Introduction 

G protein-coupled receptors (GPCRs) are important targets for the pharmaceutical industry and 
have consequently been studied extensively in vivo, in vitro, and in silico. Their importance is 
illustrated by the fact that PubMed [1] lists around 500 reviews relating to GPCRs every year. The first 
three-dimensional structure of a GPCR was solved in 2000 [2], and recent years have seen a flurry of 
GPCR structures [3] being solved, published and deposited in the PDB (Protein Data Bank) [4]. Nearly 
all the GPCR structures solved so far are from the rhodopsin-like family — normally referred to as the 
Class A GPCR family. This article exclusively examines Class A GPCRs, so each time the acronym 
GPCR is used, it should be interpreted as "Class A GPCR". 

GPCRs possess seven transmembrane helices that traverse the membrane as illustrated in Figure 1.
The N-terminus is located extracellularly and the C-terminus is located in the cytosol. Each helix
contains at least one highly conserved residue (see Figure 1). These conserved residues are commonly
used to anchor GPCR sequence alignments using the soon-to-be-abandoned concept that "there are no
insertions and deletions in helices" [5,6].

Figure 1. Overview of G protein-coupled receptors (GPCR) helix bundle with conserved 
residues indicated. In most GPCR sequences, we find the following conserved residues 
(first digit indicates the helix number; see the GPCRDB [5,6] or other GPCR systems 
like the Glycoprotein-hormone Receptor Information System (GRIS) [7] for a detailed 
discussion of the numbering system): Gly129, Asn130, Leu220, Asp224, Asn729, Pro730
(in the ion pocket that is involved in signalling between the ligand binding site and the
G protein binding site); Cys315 (which forms a Cys-Cys bridge with Cys446 in the loop
between helices IV and V), Asp339, Arg340, Tyr341 (the DRY motif, involved in
activation), and Tyr528 (involved in G protein interactions); Trp420 (purple, just visible
behind the ion pocket residues; likely involved in cholesterol binding, perhaps involved in
dimer contacts), Pro520, Cys617, Trp618, Pro620 (involved in ligand interaction and
trigger and/or hinge for motions needed for activation); and Tyr733 (just underneath the ion
pocket, swings from proximity of ion pocket into direction of DRY upon activation).
Rhodopsin (PDBid = 1f88 [2]) has been used for this figure.
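
The numbering convention used in this caption, in which the first digit of a residue number gives the helix, can be illustrated with a short sketch. The function names are hypothetical, and reading the remaining two digits as the position within the helix is an assumption made here for illustration; the GPCRDB [5,6] describes the actual scheme.

```python
# Roman numerals as used for the helices in this article.
ROMAN = {1: "I", 2: "II", 3: "III", 4: "IV", 5: "V", 6: "VI", 7: "VII", 8: "VIII"}


def split_residue_number(number):
    """Split a three-digit residue number such as 733 into (helix, position).

    Assumption: the first digit is the helix number (as stated in the
    caption) and the last two digits are the position within that helix.
    """
    helix, position = divmod(number, 100)
    if helix not in ROMAN:
        raise ValueError(f"{number} does not start with a helix number 1-8")
    return helix, position


def describe(label):
    """Turn a label such as 'Tyr733' into a readable description."""
    residue, number = label[:3], int(label[3:])
    helix, position = split_residue_number(number)
    return f"{residue} at position {position} of helix {ROMAN[helix]}"


print(describe("Tyr733"))
print(describe("Arg340"))
```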

In the year 2000, the first structure of a GPCR was solved experimentally [2]. This structure not
only revealed a series of surprises but also confirmed [8] or falsified [9] a series of previous
hypotheses. Other hypotheses [10] were proven right in concept but wrong in detail. After the second
GPCR structure had been solved (the β2 adrenergic receptor [11]), the floodgates opened, with the
result that over the last few years we have on average been able to study one new GPCR structure
every few months. The most frequent source of newly solved GPCRs has been the Stevens Lab [3].

GPCRs are notoriously difficult to crystallize. They are membrane proteins and therefore have large 
hydrophobic surfaces that are prone to nonspecific binding. However, it is perhaps even more 
important to realize that most GPCRs exhibit constitutive activity, which means they are highly 
mobile, continuously moving in and out of the active state. Consequently, GPCR crystallographers not 
only need to cope with the classical membrane crystallization problem of sticky hydrophobic surfaces, 
they also need to fix the GPCR in one of its many marginally stable conformations. The sticky
surface problem is frequently addressed by adding amphiphiles, while increasing the hydrophilic
surface is often done by cloning lysozyme covalently into the GPCR. This covalent cloning is typically
done between helices V and VI (e.g., as in the β2 adrenoceptor [11]), but lysozyme has also been
cloned-in at other locations (e.g., the β2 adrenoceptor in which lysozyme has been fused to the
N-terminus [12]). Other molecules are also being used for this purpose [13]. The GPCR mobility issue
has been reduced by adding strong binding ligands (e.g., PDBid = 3eml [14]), nanobodies
(e.g., PDBid = 3p0g [15]), G proteins (e.g., PDBid = 3sn6 [11]), and by mutating the residues that are
most involved in the activation process, for example in the β1 adrenoceptor (PDBid = 2vt4 [16]) or the
adenosine A2A receptor (PDBid = 3rey [17]). All these modifications have an influence on the 
structure. Figure 2 shows a series of examples of non-native interactions observed in GPCR structures. 
An associated web page [18] holds pictures of the 84 GPCR structures presently available with the 
modifications and non-native interactions indicated. 

When a GPCR structure deviates from the canonical situation shown in Figure 1, it is important to 
consider whether the deviation is an interesting finding or a crystallization artifact. For example, in the 
recently solved structure of the NTS1 neurotensin receptor (PDBid = 4grv [19]) we observed that helix 
VIII is barely present and what can be seen of it appears to be displaced. This could be an interesting 
finding, but it seems more likely that a large number of crystal packing contacts (see Figure 3) has 
caused helix I to occupy the canonical space of helix VIII, causing helix VIII to find another stable 
position packing against the cloned-in lysozyme structure [19]. 

Mason et al. [20] found that upon receptor activation, the volume of the ligand binding site
decreased by ~40 Å³ for the β1 and β2 adrenoceptors and by ~90 Å³ for the purinergic A2A receptor.
This indicates that agonists stabilize a receptor conformation in which the extracellular sides of certain 
helices are closer together. This finding, combined with the finding of the "see-saw" like motions of 
helices [21,22], results in a model that can be compared to the mechanism of a clothespin. This model 
is explained in Figure 4. 

Figure 2. Examples of crystallization artifacts. (A) The turkey β1-adrenoceptor,
PDBid = 2vt4 [16]. The asymmetric unit contains a dimer of non-natural upside-down
dimers. The A subunit in the first dimer is shown in blue while the other three monomers
are shown in purple. Crystallization additives are shown in a ball representation. Residues
in other molecules in the crystal that have at least one atom within 10 Å from any of the
four subunits in the asymmetric unit are shown as a purple stick model; (B) Ribbon
representation of bovine rhodopsin, PDBid = 1f88 [2]. All residue positions mutated for
thermostabilization in any of the structures mentioned in this article are shown in a red
ball-representation; and (C) Trace representation of the β2 adrenoceptor, PDBid = 2r4s [23]
shown in dark blue bound to an antibody shown in magenta. The β2 adrenoceptor with
PDBid = 3ny9 [12] that does not bind anything in this region is superposed and shown in
light blue. Both are bound to inverse agonists. The antibody distorts helix VIII resulting in
a bulge in 2r4s (shown in red) followed by an anti-bulge (3₁₀ helix) at the location where
3ny9 has a normal helix (shown in green).





GPCR structures have been modified in many different ways to aid crystallization, but fortunately, 
because many different crystallization methods have been used, we can still extract an overall picture 
by observing trends in large numbers of GPCR structures. After dealing with sequence insertions and 
deletions caused by α-bulges and short stretches of π-helix and 3₁₀-helix, we can unambiguously assign
203 residues that are common to all GPCR structures, all of which we can be reasonably sure are 
homologous. We have measured the 20,503 pair-wise distances between these 203 residues in 
69 GPCR structures, and we have analyzed these distances using the random forest (RF) method. 
One of the scientific possibilities offered by the RF method when applied to a very large set of 
observations is detection of those observations that are most supportive for a hypothesized 
classification. We produced a series of structure classifications, each consisting of either two or three
groups. Examples are: dark-state rhodopsins versus other receptors in the inactive state; receptors with 
a G protein or nanobody bound at the cytosolic side versus all others; and adenosine receptors with 
bound inverse agonists, antagonists or agonists. 
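
The distance-space representation described above can be sketched in a few lines. The sketch is illustrative only: it assumes one representative coordinate per homologous residue (for example the Cα atom, which is an assumption made here; the text does not state which atoms were measured), and the function names and toy coordinates are hypothetical. It also confirms that 203 residues yield 203 × 202 / 2 = 20,503 unordered pairs.

```python
from itertools import combinations
from math import dist


def n_pairs(n_residues):
    """Number of unordered residue pairs among n_residues residues."""
    return n_residues * (n_residues - 1) // 2


def pairwise_distances(coords):
    """Return {(i, j): distance} for every residue pair with i < j.

    coords is a list of (x, y, z) tuples, one per homologous residue,
    in the same residue order for every structure that is compared.
    """
    return {
        (i, j): dist(coords[i], coords[j])
        for i, j in combinations(range(len(coords)), 2)
    }


# Toy structure with three "residues" (made-up coordinates, in Å).
toy_coords = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 5.0)]
distances = pairwise_distances(toy_coords)

print(n_pairs(203))        # pairs for the 203 common residues
print(distances[(0, 1)])   # distance between toy residues 0 and 1
```

Computing one such dictionary per structure gives, for every residue pair, a vector of distances across structures; those vectors are the observations that a classifier such as a random forest would receive.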

Figure 3. The structures of the neurotensin 1 (NTS1) receptor (red; PDBid = 4grv [19]) 
and rhodopsin (blue; PDBid = 1f88 [2]). The lysozyme that is cloned between helices V
and VI in NTS1 is partly shown (in yellow). Helix VIII is shown in dark blue in rhodopsin 
and in green in NTS1. The purple lines indicate crystal packing contacts made by helix I 
in NTS1. 




The results shed new light on the GPCR activation process, on the roles of many individual 
residues, and on the helix motions. They also shed new light on α-bulges and their role in the dynamic
processes related to GPCR activation. Upon activation (either by G protein binding or by other means)
helices III and VI separate at the cytosolic side, while helix III translates slightly in the extracellular 
direction [18]. 

Helical distortions play a major role in the overall fold of GPCRs. Most GPCRs possess 
α-bulges [24-30] in helices II and V, but they are also observable in most other helices in one or more
GPCRs; only in the lipid receptor structures did we not detect any α-bulges. The highly conserved
α-bulges often mentioned in helices II and V are not present in all GPCR structures, and when they are
present, their locations are not always strictly conserved. In the second transmembrane segment
(TM2), for example, the proline pattern is not conserved in sequence space. When present, it may be
located at position 232, 233, or 234. In 2009 Deville et al. [29] proposed that an indel led to two
structural motifs for helix II: a bulge in receptors with a proline at position 233 or 234, or a kink in
receptors with a proline at position 232. In the structures available today, though, the proline in TM2 is
structurally conserved and the observed sequence variability is caused by the absence or presence of an
α-bulge. This sequence motif is particularly important for predicting the presence of bulges near the
ligand-binding site, which in turn is crucial for homology modeling. Figure 5 shows how in the trace
amine subfamily in the GPCRDB [5] a bulge is observed near position 229 in about half of all
family members.

α-Bulges tend to remain stable in molecular dynamics simulations, indicating that these bulges
represent (at least) a local energy minimum (DE Gloriam, personal communication).

Figure 4. Mechanism of GPCR activation. Agonist binding (1) induces inward motions 
(2) of the extracellular side of helices V-VII. This is accompanied by an outward 
movement of the cytosolic side of helices V-VII (3), allowing the G protein (shown as 
solid blue blob) to bind (4) and become activated (5). Obviously, there is no fixed order in 
the motions. The determination of what moves where depends largely on the superposition 
method used. 




Figure 5. Fourteen consecutive residues in helix II (starting at the conserved L at position 
220) extracted from the trace amine sequence alignment from the GPCRDB. From left to 
right the columns contain the GPCRDB sequence number, the consensus sequence at that 
position, and the amino acid at that position in each of the 61 sequences in the GPCRDB 
trace amine family 16. Figure copied with permission from Isberg et al. [31]. 

We will argue that the role of prolines near the kinks in the helices is different from what was
previously thought. Overall, our results lead to a series of new hypotheses that are amenable to
experimental validation.

2. Results 

The way we use the RF method to analyze GPCR structure characteristics in distance space requires 
that the hypotheses put forward logically lead to classifications of existing GPCR structures involving 
a limited number of groups. Any hypothesis related to the activation process will be a good candidate, 
because the GPCR structure community has been working hard to shed light on this process by 
solving the structures of GPCRs with bound agonists, partial agonists, etc. and GPCR structures in the 
presence and absence of G proteins. 

Visual inspection reveals that residue Tyr733 is displaced the most between active and inactive 
structures, with widening of the gap between helices III and VI also being closely associated with 
activation. In total, 202 distance vectors relate to displacement of the Tyr733 residue. However, 
almost eight times this number of vectors contribute to the relative displacement of helices III and VI. 
This asymmetry between the number of distance vectors related to displacement of Tyr733 and the 
number of distance vectors related to displacement of helices III and VI causes artifacts in the RF 
computations [32,33]. We therefore iteratively searched for the pair of distance vectors that had the 
highest Pearson correlation coefficient, randomly removing one of the two vectors during each 
iteration. This process was repeated until no two distance vectors had a Pearson correlation coefficient 
higher than 0.90, resulting in removal of around 80% of the vectors. The distance vectors that the RF
analysis found most representative of the differences between active and inactive structures are
projected on 1F88 in Figure 6. The yellow lines in Figure 6 represent distances that are shorter in the
active state GPCRs than in the inactive ones. Most of the yellow vectors in Figure 6 are caused by 
a small but systematic displacement of helix III towards the extracellular side in the active state 
structures. The magenta lines represent distances that increase upon activation. These mainly involve 
the cytosolic side of helix VI, which undergoes an outward motion away from the rest of the helix 
bundle upon activation. 
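
The pruning of correlated distance vectors described above can be sketched as follows. This is a minimal illustration in plain Python under stated assumptions, not the implementation used for the article: the function names, the toy data, and the fixed random seed are hypothetical, and the signed Pearson coefficient is used because the text gives the stopping criterion as a coefficient "higher than 0.90".

```python
import random
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def prune_correlated(vectors, threshold=0.90, seed=0):
    """Drop vectors until no pair has a correlation above the threshold.

    vectors maps a name (e.g. a residue pair) to its distances across structures.
    Each iteration finds the most correlated pair and removes one member at random.
    """
    rng = random.Random(seed)
    kept = dict(vectors)
    while len(kept) > 1:
        pair, r = max(
            (((a, b), pearson(kept[a], kept[b])) for a, b in combinations(kept, 2)),
            key=lambda item: item[1],
        )
        if r <= threshold:
            break
        del kept[rng.choice(pair)]
    return kept


# Toy data: d1 and d2 are nearly collinear, d3 is not.
toy_vectors = {
    "d1": [1.0, 2.0, 3.0, 4.0],
    "d2": [2.0, 4.1, 5.9, 8.0],
    "d3": [4.0, 1.0, 3.0, 2.0],
}
pruned = prune_correlated(toy_vectors)
print(sorted(pruned))
```

The sketch recomputes all pairwise correlations in every iteration, which is simple but slow; for a set as large as 20,503 vectors a cached correlation matrix would be the more practical choice.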

Tyr733, which is located at the intracellular side of helix VII, is the most important single residue 
to classify receptors as either being in the active state or the inactive state. It has been 
hypothesized [2,8,34-36] that this highly conserved residue stabilizes the active state by stabilizing the 
open conformation of the "ionic lock" formed by Arg340 and Glu600. During activation, the local 
backbone around Tyr733 undergoes a complicated reorganization, part of which involves the 
appearance/disappearance of an α-bulge.

When we use RF to determine the most significant distance differences between active and inactive 
GPCR structures, we find that many of these involve a residue in helix III. Visual inspection of 
superposed structures in the active and inactive state reveals that helix III is slightly displaced towards 
the extracellular direction in the activated GPCRs. Helix III is also slightly more wound up in the 
activated state. We also observe a correlation between truncation of the N-terminus and an outward
movement of the extracellular side of helix I, as illustrated in Figure 7. 

Figure 6. Distances found to be important by the random forest method to separate active
from inactive structures mapped on rhodopsin (PDBid = 1f88 [2]) looked at from the
intracellular side. The distances indicated in magenta increase upon activation; distances 
indicated in yellow decrease upon activation. 




Although many articles have been published about the role of prolines in inducing kinks in helices 
(e.g., [37-43]), we do not believe that prolines actively induce these kinks. We believe that they 
merely allow for them. Nevertheless, the suggestion that a relationship exists between prolines, kinks, 
and the mobility that is necessary for GPCR activation continues to be a prominent part of 
experimental studies. However, our structural comparisons between GPCRs in the active and inactive 
form do not fully support this suggestion, as is illustrated in Figure 8, in which the active form of the 
β2-adrenoceptor is superposed on its inactive form. Figure 8 illustrates that the structural differences
observed in the helices are not related one-to-one to the presence of prolines. The major hinge point is 
located near the kink in helix VI, but this helix does not bend in the middle; it undergoes a full-helix 
rotation. The blue and red helix VI in Figure 8 can be superposed on all 28 Cα atoms (including the
residues in the irregular area near the ligand binding site) with an RMS displacement of only 0.9 Å.
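
For reference, the RMS displacement quoted above is the root of the mean squared distance between paired atoms. The sketch below only evaluates that quantity for two coordinate sets that are assumed to be already superposed and listed in the same atom order; it does not perform the optimal superposition itself (which was done with dedicated software), and the function name and toy coordinates are hypothetical.

```python
from math import sqrt


def rms_displacement(coords_a, coords_b):
    """RMS displacement (same units as the input) between paired atoms.

    coords_a and coords_b are equally long lists of (x, y, z) tuples that
    are assumed to be superposed already, with atom i of one list matching
    atom i of the other.
    """
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return sqrt(squared / len(coords_a))


# Toy example: every atom is shifted by exactly 1 Å along z.
helix_a = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
helix_b = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (3.0, 0.0, 1.0)]
print(rms_displacement(helix_a, helix_b))
```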

In every kinked helix, a highly conserved proline is found near the kink. Whenever there is a 
non-helical element in the middle of a helix, a proline starts the "rest" of that helix. These non-helical 
elements can take the form of an α-bulge or a short 3₁₀ helix. Our supposition is that prolines do not
induce kinks, as has often been suggested, but merely facilitate the non-helical element in the middle
of the helix and ensure that the transmembrane domain can, after the irregularity, continue as a normal
α-helix. In Figure 8 we see that for all cases the irregularities are highly similar in the active and
inactive form, except for helix VII. In the inactive state this helix has a large stretch of residues 
forming a highly irregular helix, whereas in the active state the helix is more regular. The fact that 
helix VI rotates in its entirety rather than bending at the kink near the proline further adds to the idea 
that the relationship between prolines and kinks is not as simple and direct as has often been suggested. 

Figure 7. Helix I extracted from a structural alignment of 69 structures in YASARA's [44]
Cα trace representation. The structures from which the N-terminal domain was removed
(and sometimes also a small part of the extracellular end of this helix truncated) display an
outward displacement of helix I away from the transmembrane helix bundle.



Figure 8. β2 adrenoceptor in the active state (PDBid = 3sn6 [11]; red) superposed on the
inactive state (PDBid = 3ny8 [12]; cyan). The superposition was performed with the
WHAT IF superposition module and involved 239 residues that matched with an RMS Cα
displacement of 1.34 Å. Ligands, sugars, G proteins, etc. are not shown for clarity. Prolines
are colored green. The major differences observed are in helices V and VI in the lower left 
of the figure. In the active form, helix V is much longer and the cytosolic half of helix VI is 
rotated outwards by about 30 degrees (please note that this involves a rotation of the entire 
helix, not just the cytosolic half). Further large differences are seen in helix VII (in the 
centre) and in the corner between helices VII and VIII. The different orientation of helix I 
(right most helix) in the two structures is most likely caused by crystal packing artifacts. 

Table 1 lists the α-bulges observed in the seven transmembrane helices in the 84 GPCRs studied.
Figure 9 shows the distribution of secondary structure types over all the residues that the
69 structures used in this study have in common. The often mentioned [24-30] α-bulges in helices II and V are present
in nearly all GPCRs, but they are not as highly conserved as some other features, and their locations
are not conserved. Especially in helix V, many more irregularities are observed, and the S1P lipid
receptor lacks the α-bulge in helix V. Interestingly, this receptor does not have the otherwise conserved
proline at position 520. This finding might even indicate an alternative way for the lipid ligand to
enter the active site, because the flexible ligand can perhaps enter directly from the membrane between 
helices IV and V [18]. Figure 10 illustrates the variability of the bulges observed in helix V. 

Figure 9. Secondary structure distribution for residues that are common to the 
69 structures used in this study. Blue: α-helix; yellow: α-bulge/π-helix; purple: 3₁₀ helix;
orange: loop, strand, and turn. Each vertical bar is 69 residues high and the fraction of
the bar in a certain color corresponds to the fraction of residues with the corresponding
secondary structure. Secondary structures were determined with DSSP 2.0. The plot
contains all transmembrane residues plus a few residues into the loop areas that all
69 structures have in common. In most α-bulge areas, one residue is therefore not counted.
Small white bars represent the elements between the transmembrane regions that are not
structurally conserved throughout the 69 structures. The central part of helix VII is either
a regular helix, or consists of a stretch of 3₁₀ helix combined with a bulge or similar
irregularity. Despite bulges and 3₁₀ stretches, the part of helix VII that ends with the
conserved NP motif at positions 729 and 730 always has equally many residues so that no 
residue numbering differences can be observed between receptors. 
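
The per-position fractions plotted in Figure 9 can be tallied as sketched below. The sketch assumes that each structure is represented by a string of one-letter DSSP codes over the same aligned positions. H (α-helix), G (3₁₀ helix), and I (π-helix) are standard DSSP codes; mapping I to the "α-bulge/π-helix" class follows the figure legend but is an assumption about how the assignment was made, and the function name and toy strings are hypothetical.

```python
from collections import Counter

# DSSP one-letter codes mapped onto the classes of the figure legend.
# Every other code (strand, turn, bend, loop, ...) is pooled as "other".
DSSP_TO_CLASS = {
    "H": "alpha-helix",
    "I": "alpha-bulge/pi-helix",
    "G": "3-10 helix",
}


def class_fractions(dssp_strings):
    """Return, for each aligned position, the fraction of structures per class."""
    if not dssp_strings:
        raise ValueError("need at least one DSSP string")
    length = len(dssp_strings[0])
    if any(len(s) != length for s in dssp_strings):
        raise ValueError("all DSSP strings must cover the same positions")
    n_structures = len(dssp_strings)
    fractions = []
    for column in zip(*dssp_strings):
        counts = Counter(DSSP_TO_CLASS.get(code, "other") for code in column)
        fractions.append({cls: count / n_structures for cls, count in counts.items()})
    return fractions


# Toy alignment: four made-up structures, four aligned positions.
toy_alignment = ["HHIH", "HHHH", "HGIT", "HHI-"]
fractions = class_fractions(toy_alignment)
for position, per_class in enumerate(fractions, start=1):
    print(position, per_class)
```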

Table 1. Occurrence of α-bulges in the seven helices in the 84 GPCRs. Most α-bulges
occur in helix II and helix V (near the ligand-binding pocket). The lipid receptors contain
no α-bulges. α-Bulges were determined with DSSP 2.0. Helix VII generally contains a
stretch of 3₁₀ helix near sequence position 720-725 (see Figure 9). The numbers 0, 1,
and 2 indicate that we observed zero, one, or two bulges in that helix, respectively. In a
few cases one extra turn of 3₁₀ helix is observed near the cytosolic end of helix VII;
these are indicated with -1. * The bulge in helix V of the A2A receptors is twice as long as in
other receptors.

Figure 10. The area around the bulge in helix V. The S1P lipid receptor (orange,
PDBid = 3v2y [45]) does not have α-bulges in helix V and is given as a reference.
Rhodopsin (magenta, PDBid = 1f88 [2]) and the adenosine-2A receptor (blue,
PDBid = 3eml [46]) have an α-bulge (A) between positions 516 and 517; and the
adenosine-2A receptor has an extra bulge (B) between positions 511 and 512. Side chains
of residues at position 520 are shown. Rhodopsin and the adenosine-2A receptor have a
proline at position 520. The S1P lipid receptor, which does not have bulges in helix V,
does not have a proline at position 520. Time will tell if this correlation is accidental
or causal.




Most structures have an α-bulge in the middle of helix II, notable exceptions being opioid, CXCR4,
CCR5, and lipid receptors. Visual inspection of structures containing the bulge in helix II does not
suggest a functional role for it, even though it is located at the same depth in the membrane as the ligand
binding site. Squid rhodopsin (PDBid = 2z73 [47]) has an α-bulge at the extracellular side of helix II
that is absent in bovine rhodopsin. 

The α-bulge after Tyr733 is present in all GPCR structures except the β1 and β2 adrenoceptors.
The absence of this bulge in β1 and β2 adrenoceptors is correlated with the presence of a proline at
location 808 (in the corner between helices VII and VIII). It therefore appears that both the bulge and 
the proline can facilitate a similar displacement of Tyr733. Figure 11 illustrates these effects. This 
bulge may be "needed" to allow tyrosine 733 to bend inwards and assist in stabilization of G protein 
binding to the GPCR. The word "needed" should be read with caution, as there is no indication of a 
causal relationship between the two observations. 

Figure 12 shows the bulges in helix IV in the CXCR4 chemokine receptor (PDBid = 3odu [48]) 
and the M2 muscarinic acetylcholine receptor (PDBid = 3uon [49]). At present, there is no evidence
that associates these bulges with any particular function, although it could be speculated that
they play a role in dimer formation.

Figure 11. Structural variability in the intracellular part of helix VII near residue Tyr733.
In all four panels the cytosolic side of helix VII and the beginning of helix VIII are shown
as a trace model and the Tyr733 side chain is shown as a stick model. In all four panels
helix VIII points to the right. (A) Cyan represents inactive rhodopsin (PDBid = 1f88 [2]),
blue represents inactive β2 adrenoceptor (PDBid = 3ny9 [12]). After Tyr733, rhodopsin
forms a normal helix and the β2 adrenoceptor forms a 3₁₀ helix, which is followed by a
proline at the beginning of helix VIII. In Panels B-D, cyan represents the inactive state,
red represents the active state; (B) Adenosine A2A receptor. Cyan: PDBid = 3eml [46].
Red: PDBid = 2ydo [14]; (C) β2 adrenoceptor. Cyan: PDBid = 3ny9 [12]. Red:
PDBid = 3sn6 [11]; and (D) Rhodopsin. Cyan: PDBid = 1f88 [2]. Red: PDBid = 3cap [50].
In rhodopsin and the adenosine A2A receptor, a bulge is formed upon activation.
In the β2 adrenoceptor the inactive state has a 3₁₀ helix, which becomes a normal helix
upon activation.

Figure 12. Bulges in helix IV. Left: CXCR4 chemokine receptor (blue; PDBid = 3odu [48])
superposed on the delta opioid receptor (cyan; PDBid = 4ej4 [51]) as reference.
The CXCR4 chemokine receptor has an α-bulge between positions 418 and 419; and
Right: M2 muscarinic acetylcholine receptor (red; PDBid = 3uon [49]) superposed on the
dopamine D3 receptor (cyan; PDBid = 3pbl [52]) as a reference. The M2 muscarinic
receptor has an α-bulge between positions 428 and 429.




For the adenosine 2A receptor, structures are available bound to inverse agonists, antagonists, 
and agonists. Unfortunately, no structure is available yet for this receptor with a bound G protein. 
Figure 13 shows the superposition of the eleven available structures for the adenosine 2A receptor and
indicates that in most cases the differences between structures with a bound inverse agonist and 
structures with a bound antagonist are small. However, the structural differences are larger when 
either of these two groups is compared to structures with a bound agonist. Most of these structural 
differences seem to agree with the clothespin mechanism [53] but the displacements seen for the 
cytosolic side of helix V are surprising. When going from a bound inverse agonist to an antagonist we 
observe an inward motion of helix V, and upon binding of the agonist we observe an outward motion. 
We have no explanation yet for this counter-intuitive phenomenon. 

Figure 13 shows differences in the helices observed in globally superposed structures. Figure 14
shows the same helices in the same colors with the helices superposed locally. This local superposition
reveals that helix VI rotates as a whole rather than its cytosolic side bending outwards (see Figure 14). 
Helix III becomes more regular and less bent upon binding an agonist. Helix V winds up more tightly 
going from the inverse agonist, via the antagonist, to the agonist bound structures. 

Many structures are available for the β2 adrenoceptor family, including structures with bound
agonists, antagonists, and inverse agonists. We also have a structure for a bound G protein trimer, and
a structure with a nanobody bound in such a way that similar displacements are observed. This
diversity of β2 adrenoceptor structures allows an in-depth study of how they move during different
phases of the activation process. Figure 15 shows differences in the helices observed in globally 
superposed structures. 

Figure 13. (A) Eleven adenosine 2A structures superposed. Light blue: 3vg9 [21] and
3vga [21] that each bind an inverse agonist; brown: 3eml [46], 3pwh [17], 3rey [17],
3rfm [17], 3uza [54] and 3uzc [54] that each bind an antagonist; red: 2ydo [14], 2ydv [14]
and 3qak [55] that each bind an agonist. At some locations these three groups show
systematic behavior that is illustrated in the blow-up of three representative structures
(3vga, 3eml and 2ydo) in the panels B-E; (B) Helix II shows a systematic displacement of
the area around the α-bulge towards helix III in the activated receptors; (C) Helix V in the
activated receptors shows a systematic displacement of the area around the α-bulge
towards helices III and IV; (D) Relative to the inverse agonist bound structures (cyan), the 
cytosolic side of helix V moves outward when an agonist is bound and inward when an 
antagonist is bound; and (E) The loop between helix VII and helix VIII behaves 
systematically (albeit in a hard to describe way) as function of the type of ligand bound, in 
line with the crucial role in the activation process for mobility of Tyr733. 


Figure 14. Local superposition results of A2A adenosine receptor transmembrane helices.
In this figure, helices are taken out of the structure and superposed without using the rest of
the molecule. (A) Helix III becomes more regular upon activation. The side chain of
Val322 (known in many GPCRs to be crucial for binding the endogenous ligand) is
indicated as a point of reference; (B) Helix IV is highly irregular at the cytosolic side. It is 
not clear if this is caused by the bound ligand or if it is caused by a crystal packing artifact; 
(C) Helix V is seen winding up more tightly going from inverse agonist, via antagonist, to 
agonist bound structures; (D) Helix VI neither tilts nor kinks upon activation. Instead, the 
entire helix rotates. A minor tightening of the cytosolic end of the helix is observed in the 
agonist bound form; and (E) Helix VII forms a bulge on the intracellular side upon 
activation while rotating toward the centre of the seven-helix bundle. 3qak shows a 
different Cα trace, which correlates with the absence of a structural water near
this difference. 

Figure 15. (A) Seven β2 adrenoceptor structures superposed. Light blue: 2rh1 [56],
3d4s [57], 3ny8 [12] and 3ny9 [12] that each bind an inverse agonist; orange: 3nya [12]
that binds to an antagonist; red: 3p0g [15] and 3sn6 [12] that each bind an agonist in
combination with a nanobody and a trimeric G protein, respectively, on the cytosolic side.
At some locations, these three groups show a systematic behavior that is illustrated in
Panels B-E, which show a blow-up of three representative structures (light blue: 3ny8
bound to an inverse agonist; orange: 3nya bound to an antagonist; red: 3sn6 bound to
an agonist and a trimeric G protein on the cytosolic side); (B) In helix II we observe a
systematic motion of the area around the α-bulge towards helix III in the activated receptor;
(C) In helix V, we observe a systematic motion of the area around the α-bulge towards
helix III and helix IV in the activated receptor; (D) Relative to the inverse agonist bound 
structure (cyan) and the antagonist bound structure (orange), the cytosolic side of helix V 
moves outward when an agonist and a G-protein are bound as does helix VI; and 
(E) The loop between helix VII and helix VIII behaves systematically (albeit in a hard to 
describe way) as function of the type of ligand bound. 
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3. Discussion 

Over one hundred GPCR structures are available in the PDB. These structures relate to more than 
ten different GPCRs that have been solved with and without nanobodies or G proteins bound to them, 
with different ligands bound to them, and with different modifications to achieve crystallization. 
All these structures contain at least some artifacts due to crystallization. Binding a ligand or a 
G protein leads to structural changes such as helix displacements or rotamer flips in residues, but it is 
sometimes hard to separate these endogenous activation-related structure changes from those caused 
by crystallization artifacts. We used the variable importance score of the Random Forest (RF) method 
to elucidate which distances differ systematically as a function of motions associated with different 
phases of the GPCR activation process. However, the results of this approach cannot be taken at 
face value, because we manually defined which structures were in the active or inactive state 
(G protein bound, or nanobody bound at the same location) and then used the RF method to determine 
which interatomic distances differ most systematically between the defined states. Similarly, 
a comparison of structures with bound agonists, antagonists, or inverse agonists is only as good as the 
determination of whether the bound compounds indeed are an agonist, etc. In addition, there is the risk 
that we may have missed a confounding variable — for example, the possibility that inverse agonists 
only bind to structures in which a certain mutation has been introduced. 

The comparison between active structures and inactive structures, and between structures with an 
agonist bound and structures with an inverse agonist bound, revealed several interesting systematic 
movements. The strongest systematically observed effect in GPCR activation is an inward motion of 
Tyr733. It has been hypothesized [2,8,34-36] that this residue stabilizes the active conformation by 
interacting with Arg340, thereby stabilizing the open conformation of the "ionic lock" comprising 
Arg340 and Glu600. The outward motion of the intracellular side of helix VI is also systematically 
related to stages of the activation process, and is more a rotation of the entire helix than a bending 
of the helix at the kink near W618 and P620. 

Compared to structures in the inactive state, structures in the active state show an upward (towards 
the outside of the cell) displacement of helix III combined with a hard-to-describe screw motion that 
makes the extracellular part of helix III a more regular α-helix. The extent of this upward displacement 
differs from case to case. 

Agonist binding brings helices V-VII closer to each other around the ligand-binding site. 
Activation and G-protein binding are associated with a displacement of the intracellular part of helix 
VI that creates the crevice in which the G protein binds. Helix VI, however, does not have a 
hinge-point where the bending angle of the helix changes. The helix itself stays rigid but it rotates in 
such a way that its cytosolic side moves outwards. Combined with the regularization of helix III, and 
the known constitutive activity of many GPCRs in the absence of agonists, we see similarity to a 
clothespin in the model for G protein activation. GPCRs are mostly in the inactive conformation 
(clothespin closed) and occasionally in the active conformation (clothespin open), in which state they 
can activate a G protein (catch the laundry). Agonist binding and G protein binding both stabilize the 
active conformation. The activating ligand keeps the extracellular sides of helices V-VII close together 
and straightens helix III, while also moving helix III upwards, towards the extracellular side. The other 
helices do not bend at their hinge points but maintain the same general local orientation. It could be 
hypothesized that the upward movement of helix III makes it easier to break the ionic lock. During 
GPCR activation, Arg340 flips upwards, where it is maintained in position by interactions with Tyr733 
and Tyr528. This creates a crevice at the intracellular side of the receptor in which the C-terminal end 
of a G protein's α-subunit can dock. After binding to the activated GPCR, the G protein exchanges its 
bound GDP for a GTP and activates downstream pathways. An inverse agonist keeps the GPCR in its 
inactive state by pushing the extracellular sides of the helices outward, thus keeping the cytosolic sides 
of the helices close to one another. 

4. Methods 

GPCR structures were extracted from the PDB [58] and stripped of water, crystallization additives, 
multimeric partners, G proteins, cloned-in lysozymes, antibodies, etc. They were renumbered using the 
GPCRDB numbering scheme [59] which has the advantage of allocating identical residue numbers to 
homologous residues, making it easier for many computer programs to deal with them. For example, 
the Utopia-GPCRDB intelligent PDF-reader [60] directly couples GPCR-related articles to the 
GPCRDB [61] information system, making it very easy to place information extracted from an 
article in the wider context of GPCR knowledge. This article, too, can be read most 
productively using the Utopia-GPCRDB intelligent PDF-reader, which can be downloaded from the 
GPCRDB [61]. 

Unless mentioned otherwise, global structure superpositions were performed using the YASARA 
(YASARA Biosciences, Vienna, Austria) [44] implementation of the Mustang structure alignment 
program [62]. Local superpositions were produced with the WHAT IF [63] superposition module [64] 
that is integrated into YASARA. The superposition of single helices using only Cα atoms of 
corresponding residues (thus neglecting bulge residues when one superposition partner does not 
possess that bulge) was performed using a variant of the WHAT IF superposition module specifically 
adapted for the purpose. 
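
The residue selection step described above can be illustrated with a minimal Python sketch. It is not the WHAT IF module: it only pairs Cα positions that carry the same residue number in both helices, so that a bulge residue present in only one helix is skipped, and it reports the root-mean-square distance over those pairs without performing any least-squares fitting. The function names, the residue numbers, and the coordinates are all hypothetical and chosen for illustration only.

```python
import math


def corresponding_ca_pairs(helix_a, helix_b):
    """Pair Calpha positions of residues that occur in both helices.

    Each helix is a dict mapping a residue number to an (x, y, z) tuple.
    Residues present in only one helix (for example an extra bulge
    residue) are left out of the pairing.
    """
    shared = sorted(set(helix_a) & set(helix_b))
    return shared, [(helix_a[n], helix_b[n]) for n in shared]


def rmsd(pairs):
    """Root-mean-square distance over coordinate pairs (no fitting is done)."""
    if not pairs:
        raise ValueError("no corresponding residues to compare")
    total = sum(math.dist(p, q) ** 2 for p, q in pairs)
    return math.sqrt(total / len(pairs))


# Hypothetical toy data: residue 3 exists only in the helix with the bulge.
with_bulge = {
    1: (0.0, 0.0, 0.0),
    2: (1.0, 0.0, 0.0),
    3: (2.0, 1.0, 0.0),
    4: (3.0, 0.0, 0.0),
}
without_bulge = {
    1: (0.0, 0.0, 0.0),
    2: (1.0, 0.0, 0.0),
    4: (3.0, 0.0, 3.0),
}

shared, pairs = corresponding_ca_pairs(with_bulge, without_bulge)
print(shared)       # residue numbers used for the comparison
print(rmsd(pairs))  # deviation over the shared residues only
```

Because homologous residues share a number, the pairing reduces to a set intersection; in a real superposition the paired coordinates would then be passed to a fitting routine.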

Analyses were performed using the randomForest package [65] for R [66]. Variable 
importances for predefined classes were determined using standard parameters and ntree = 1000, 
importance = TRUE. Variables were sorted with the Gini importance parameter as sorting key. 
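
To make the idea of ranking distances by impurity decrease more concrete, the following Python sketch computes, for each variable, the largest decrease in Gini impurity that a single threshold split can achieve, and sorts the variables by that value. This is a deliberately simplified stand-in and not the randomForest procedure, which accumulates such decreases over the splits of many randomized trees. The variable names, distance values, and class labels are invented for illustration.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_gini_decrease(values, labels):
    """Largest impurity decrease obtainable with one threshold on `values`."""
    n = len(labels)
    parent = gini(labels)
    best = 0.0
    # Every distinct value except the largest is a candidate threshold.
    for threshold in sorted(set(values))[:-1]:
        left = [lab for val, lab in zip(values, labels) if val <= threshold]
        right = [lab for val, lab in zip(values, labels) if val > threshold]
        children = (len(left) * gini(left) + len(right) * gini(right)) / n
        best = max(best, parent - children)
    return best


def rank_variables(table, labels):
    """Sort variables by their single-split Gini decrease, largest first."""
    scores = {name: best_gini_decrease(vals, labels) for name, vals in table.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical data: two interatomic distances measured in five structures.
labels = ["inactive", "inactive", "inactive", "active", "active"]
distances = {
    "dist_A": [8.1, 8.4, 7.9, 14.2, 15.0],    # separates the classes cleanly
    "dist_B": [10.0, 12.1, 11.0, 10.5, 11.8],  # does not
}

ranking = rank_variables(distances, labels)
for name, score in ranking:
    print(name, round(score, 3))
```

As noted in the Discussion, any such ranking depends on the manually assigned class labels, so a high score indicates a systematic difference between the defined groups rather than a proven mechanistic role.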

α-Bulges were detected with a rewritten version of DSSP [67]. This rewritten version, DSSP 2.0, 
produces results that are >99.9% identical to those of the now thirty-year-old original DSSP 1.0 
program written by Kabsch and Sander when α-bulges (also often called π-helices [26]) are not counted. 
Table 1 lists the structures used and their α-bulges. 
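
As an illustration of how such assignments can be post-processed, the Python sketch below scans a per-residue secondary structure string for runs of the DSSP code I (π-helix) that are flanked on both sides by H (α-helix). It assumes that the string has already been extracted from DSSP output; reading the DSSP file itself is not shown. Treating every H-flanked run of I as one bulge is a simplifying assumption made here, not necessarily the criterion used for Table 1, and the example string is invented.

```python
import re


def find_bulges(secondary_structure):
    """Locate runs of 'I' that are flanked by 'H' on both sides.

    Returns a list of (start, end) index pairs into the string,
    0-based with `end` exclusive.
    """
    pattern = re.compile(r"(?<=H)I+(?=H)")
    return [(m.start(), m.end()) for m in pattern.finditer(secondary_structure)]


# Hypothetical secondary structure string for one transmembrane segment.
example = "  HHHHHHHIIIIIHHHHHHH  GGG"

for start, end in find_bulges(example):
    print(start, end, example[start:end])
```

Requiring helical residues on both sides restricts the hits to π-type turns embedded within a helix, which is the situation described for the transmembrane helices in this article.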

DSSP 2.0 is freely available through the Centre for Molecular and Biomolecular Informatics (CMBI) 
structure facilities [68] web pages [69] or directly from the DSSP FTP site at [70]. All structures 
mentioned in this study, and an extensive study of the effects of mutations and crystallization additives 
on the structures, are available at [18]. This website also holds many methodological details that are 
beyond the scope of this article. 

5. Conclusions 

We have shown that α-bulges are a prominent feature of GPCRs and that the bulges in helices II 
and V are highly conserved, albeit varying in type, and in helix V also varying in location. We also 
observed bulges in helices IV and VII. In several cases, the presence or absence of these bulges is 
directly linked to the activation process, but in other cases this conclusion awaits experimental 
validation. Although α-bulges have so far largely eluded the interest of the GPCR research community, 
we hope that this short summary of our observations will stimulate experiments aimed at shedding 
light on their precise functional role. 